218 research outputs found
Recommended from our members
A Haystack Heuristic for Autoimmune Disease Biomarker Discovery Using Next-Gen Immune Repertoire Sequencing Data.
Large-scale DNA sequencing of immunological repertoires offers an opportunity for the discovery of novel biomarkers for autoimmune disease. Available bioinformatics techniques, however, are not well suited to identifying biomarker candidates within large immunosequencing datasets because of unsatisfactory scalability and sensitivity. Here, we present the Haystack Heuristic, an algorithm customized to computationally extract disease-associated motifs from next-generation-sequenced repertoires by contrasting disease and healthy subjects. This technique employs a local-search graph-theory approach to discover novel motifs in patient data. We apply the Haystack Heuristic to nine million B-cell receptor sequences obtained from nearly 100 individuals in order to elucidate a new motif that is significantly associated with multiple sclerosis. Our results demonstrate the effectiveness of the Haystack Heuristic in computing possible biomarker candidates from high-throughput sequencing data, and the approach could be generalized to other datasets.
Gene Expression Meta-Analysis Reveals Concordance in Gene Activation, Pathway, and Cell-Type Enrichment in Dermatomyositis Target Tissues.
Objective: We conducted a comprehensive gene expression meta-analysis in dermatomyositis (DM) muscle and skin tissues to identify shared disease-relevant genes and pathways across tissues. Methods: Six publicly available data sets from DM muscle and two from skin were identified. Meta-analysis was performed by first processing data sets individually, then applying cross-study normalization and merging to create tissue-specific gene expression matrices for subsequent analysis. Complementary single-gene and network analyses using Significance Analysis of Microarrays (SAM) and Weighted Gene Co-expression Network Analysis (WGCNA) were conducted to identify genes significantly associated with DM. Cell-type enrichment was performed using xCell. Results: There were 544 differentially expressed genes (FC ≥ 1.3, q < 0.05) in muscle and 300 in skin. There were 94 shared upregulated genes across tissues enriched in type I and II interferon (IFN) signaling and major histocompatibility complex (MHC) class I antigen-processing pathways. In a network analysis, we identified eight significant gene modules in muscle and seven in skin. The most highly correlated modules were enriched in pathways consistent with the single-gene analysis. Additional pathways uncovered by WGCNA included T-cell activation and T-cell receptor signaling. In the cell-type enrichment analysis, both tissues were highly enriched in activated dendritic cells and M1 macrophages. Conclusion: There is striking similarity in gene expression across DM target tissues with enrichment of type I and II IFN pathways, MHC class I antigen-processing, T-cell activation, and antigen-presenting cells. These results suggest IFN-γ may contribute to the global IFN signature in DM, and altered auto-antigen presentation through the class I MHC pathway may be important in disease pathogenesis.
Characterizing pre-transplant and post-transplant kidney rejection risk by B cell immune repertoire sequencing.
Studying the immune repertoire in the context of organ transplant provides important information on how adaptive immunity may contribute to and modulate graft rejection. Here we characterize the peripheral blood immune repertoire of individuals before and after kidney transplant using B cell receptor sequencing in a longitudinal clinical study. Individuals who develop rejection after transplantation have a more diverse immune repertoire before transplant, suggesting a predisposition for post-transplant rejection risk. Additionally, over 2 years of follow-up, patients who develop rejection demonstrate a specific set of expanded clones that persist after the rejection. While there is an overall reduction of peripheral B cell diversity, likely due to increased general immunosuppression exposure in this cohort, the detection of specific IGHV gene usage across all rejecting patients supports the idea that a common pool of immunogenic antigens may drive post-transplant rejection. Our findings may have clinical implications for the prediction and clinical management of kidney transplant rejection.
Comprehensive analysis of normal adjacent to tumor transcriptomes.
Histologically normal tissue adjacent to the tumor (NAT) is commonly used as a control in cancer studies. However, little is known about the transcriptomic profile of NAT, how it is influenced by the tumor, and how the profile compares with non-tumor-bearing tissues. Here, we integrate data from the Genotype-Tissue Expression project and The Cancer Genome Atlas to comprehensively analyze the transcriptomes of healthy, NAT, and tumor tissues in 6506 samples across eight tissues and corresponding tumor types. Our analysis shows that NAT presents a unique intermediate state between healthy and tumor. Differential gene expression and protein-protein interaction analyses reveal altered pathways shared among NATs across tissue types. We characterize a set of 18 genes that are specifically activated in NATs. By applying pathway and tissue composition analyses, we suggest a pan-cancer mechanism in which pro-inflammatory signals from the tumor stimulate an inflammatory response in the adjacent endothelium.
Tracing diagnosis trajectories over millions of patients reveal an unexpected risk in schizophrenia.
The identification of novel disease associations using big data for patient care has had limited success. In this study, we created a longitudinal disease network of traced readmissions (disease trajectories), merging data from over 10.4 million inpatients through the Healthcare Cost and Utilization Project, which allowed disease progression to be represented across more than 300 diseases. From these disease trajectories, we discovered an interesting association between schizophrenia and rhabdomyolysis, a rare muscle disease (incidence < 1E-04) (relative risk 2.21, 95% confidence interval 1.80-2.71, P = 9.54E-15). We validated this association by using independent electronic medical records from over 830,000 patients at the University of California, San Francisco (UCSF) medical center. A case review of 29 rhabdomyolysis incidents in schizophrenia patients at UCSF demonstrated that 62% are idiopathic, without the use of any drug known to lead to this adverse event, which should alert physicians to watch for this unexpected risk in schizophrenia. Large-scale analysis of disease trajectories can help physicians understand potential sequential events in their patients.
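For readers less familiar with the statistic reported in this abstract, the sketch below shows one common way to compute a relative risk and its 95% confidence interval. It assumes the standard log-normal (Katz) approximation; the abstract does not state which method the authors used. The function name `relative_risk` and the counts in the usage lines are hypothetical illustrations and are not taken from the study.

```python
import math


def relative_risk(exposed_cases, exposed_total,
                  unexposed_cases, unexposed_total, z=1.96):
    """Return (RR, lower, upper) for a 2x2 cohort table.

    Uses the log-normal (Katz) approximation for the confidence
    interval; z=1.96 corresponds to a two-sided 95% interval.
    This is an assumed method, not necessarily the study's.
    """
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    rr = risk_exposed / risk_unexposed

    # Standard error of log(RR) under the Katz approximation.
    se_log_rr = math.sqrt(
        1 / exposed_cases - 1 / exposed_total
        + 1 / unexposed_cases - 1 / unexposed_total
    )

    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts for illustration only (NOT data from the study):
# 30 events among 10,000 exposed vs. 15 events among 10,000 unexposed.
rr, lower, upper = relative_risk(30, 10_000, 15, 10_000)
print(f"RR = {rr:.2f}, 95% CI = {lower:.2f}-{upper:.2f}")
```

An interval whose lower bound stays above 1, as with the 1.80-2.71 interval reported in the abstract, indicates that the elevated risk is unlikely to be explained by sampling variation alone.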
Non-Synonymous and Synonymous Coding SNPs Show Similar Likelihood and Effect Size of Human Disease Association
Many DNA variants have been identified for more than 300 diseases and traits using Genome-Wide Association Studies (GWASs). Some have been validated using deep sequencing, but far fewer have been validated functionally, and those functional studies have focused primarily on non-synonymous coding SNPs (nsSNPs). It is an open question whether synonymous coding SNPs (sSNPs) and other non-coding SNPs can lead to odds ratios as high as those of nsSNPs. We conducted a broad survey across 21,429 disease-SNP associations curated from 2,113 publications studying human genetic association, and found that nsSNPs and sSNPs shared similar likelihood and effect size for disease association. The enrichment of disease-associated SNPs around the 80th base in the first introns might provide an effective way to prioritize intronic SNPs for functional studies. We further found that the likelihood of disease association was positively associated with the effect size across different types of SNPs, and that SNPs in the 3′ untranslated regions, such as the microRNA binding sites, might be under-investigated. Our results suggest that sSNPs are just as likely to be involved in disease mechanisms, so we recommend that sSNPs discovered from GWAS should also be examined with functional studies.
CONTRAST: a discriminative, phylogeny-free approach to multiple informant de novo gene prediction
CONTRAST is a gene predictor that directly incorporates information from multiple alignments and uses discriminative machine learning techniques to give large improvements in prediction over previous methods